A blackboard approach towards integrated Farsi OCR system
Internal identifier: 000B00 (Main/Exploration); previous: 000A99; next: 000B01
Authors: Hossein Khosravi [Iran]; Ehsanollah Kabir [Iran]
Source:
- International journal on document analysis and recognition : (Print) [ 1433-2833 ] ; 2009.
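French descriptors
- Pascal (Inist): Reconnaissance caractère, Reconnaissance optique caractère, Texte, Système information, Classification, Analyse statistique, Approche probabiliste, Extraction forme, Segmentation.
- Wicri:
- topic: Classification.
English descriptors
- KwdEn: Character recognition, Classification, Information system, Optical character recognition, Pattern extraction, Probabilistic approach, Segmentation, Statistical analysis, Text.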
Abstract
An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.
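The blackboard arrangement described in the abstract (knowledge sources contributing to a shared solution, with an arbiter deciding which one acts next) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Blackboard`, `KnowledgeSource`, `arbiter`, the toy `segmenter` and `classifier`) is hypothetical, and the "first ready source wins" scheduling rule is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical illustrations, not taken from the paper.

@dataclass
class Blackboard:
    """Shared solution state that knowledge sources read from and write to."""
    entries: Dict[str, object] = field(default_factory=dict)

@dataclass
class KnowledgeSource:
    name: str
    can_run: Callable[[Blackboard], bool]  # precondition checked against the blackboard
    run: Callable[[Blackboard], None]      # writes a partial result to the blackboard

def arbiter(board: Blackboard, sources: List[KnowledgeSource], max_steps: int = 10) -> List[str]:
    """Assumed policy: fire the first source whose precondition holds, until none is ready."""
    fired: List[str] = []
    for _ in range(max_steps):
        ready = next((ks for ks in sources if ks.can_run(board)), None)
        if ready is None:
            break
        ready.run(board)
        fired.append(ready.name)
    return fired

# Toy stand-ins: a "segmenter" splits the input, a "classifier" labels the pieces.
segmenter = KnowledgeSource(
    "segmenter",
    can_run=lambda b: "image" in b.entries and "glyphs" not in b.entries,
    run=lambda b: b.entries.update(glyphs=b.entries["image"].split()),
)
classifier = KnowledgeSource(
    "classifier",
    can_run=lambda b: "glyphs" in b.entries and "text" not in b.entries,
    run=lambda b: b.entries.update(text="".join(g.upper() for g in b.entries["glyphs"])),
)

board = Blackboard({"image": "a b c"})
order = arbiter(board, [classifier, segmenter])
```

Although the classifier is listed first, it cannot act until the segmenter has posted its result, so the order of execution is driven by the blackboard's contents rather than by the order of registration.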
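Affiliations:
- Iran: Hossein Khosravi; Ehsanollah Kabir (Department of Electrical Engineering, Tarbiat Modarres University, Tehran)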
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000193
- to stream PascalFrancis, to step Curation: 000584
- to stream PascalFrancis, to step Checkpoint: 000217
- to stream Main, to step Merge: 000B11
- to stream Main, to step Curation: 000B00
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author><name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0180822</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0180822 INIST</idno>
<idno type="RBID">Pascal:10-0180822</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000193</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000584</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000217</idno>
<idno type="wicri:doubleKey">1433-2833:2009:Khosravi H:a:blackboard:approach</idno>
<idno type="wicri:Area/Main/Merge">000B11</idno>
<idno type="wicri:Area/Main/Curation">000B00</idno>
<idno type="wicri:Area/Main/Exploration">000B00</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A blackboard approach towards integrated Farsi OCR system</title>
<author><name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Electrical Engineering, Tarbiat Modarres University</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Character recognition</term>
<term>Classification</term>
<term>Information system</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Probabilistic approach</term>
<term>Segmentation</term>
<term>Statistical analysis</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance caractère</term>
<term>Reconnaissance optique caractère</term>
<term>Texte</term>
<term>Système information</term>
<term>Classification</term>
<term>Analyse statistique</term>
<term>Approche probabiliste</term>
<term>Extraction forme</term>
<term>Segmentation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">An integrated OCR system for Farsi text is proposed. The system uses information from several knowledge sources (KSs) and manages them in a blackboard approach. Some KSs like classifiers are acquired a priori through an offline training process while others like statistical features are extracted online while recognizing. An arbiter controls the interactions between the solution blackboard and KSs. The system has been tested on 20 real-life scanned documents with ten popular Farsi fonts and a recognition rate of 97.05% in word level and 99.03% in character level has been achieved.</div>
</front>
</TEI>
<affiliations><list><country><li>Iran</li>
</country>
</list>
<tree><country name="Iran"><noRegion><name sortKey="Khosravi, Hossein" sort="Khosravi, Hossein" uniqKey="Khosravi H" first="Hossein" last="Khosravi">Hossein Khosravi</name>
</noRegion>
<name sortKey="Kabir, Ehsanollah" sort="Kabir, Ehsanollah" uniqKey="Kabir E" first="Ehsanollah" last="Kabir">Ehsanollah Kabir</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000B00 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000B00 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:10-0180822 |texte= A blackboard approach towards integrated Farsi OCR system }}
This area was generated with Dilib version V0.6.32.